
Asymptotic normality and confidence intervals for derivatives of 2-layers neural network in the random features model

Neural Information Processing Systems

This paper studies two-layer Neural Networks (NNs), where the first layer contains random weights and the second layer is trained using Ridge regularization. This model has been the focus of numerous recent works showing that, despite its simplicity, it captures some of the empirically observed behaviors of NNs in the overparametrized regime, such as the double-descent curve, where the generalization error decreases again as the number of weights increases to $+\infty$. This paper establishes asymptotic distribution results for this two-layer NN model in the regime where the ratios $\frac{p}{n}$ and $\frac{d}{n}$ have finite limits, where $n$ is the sample size, $p$ the ambient dimension and $d$ the width of the first layer. We show that a weighted average of the derivatives of the trained NN at the observed data is asymptotically normal, in a setting with Lipschitz activation functions and a linear regression response with Gaussian features under possibly non-linear perturbations. We then leverage this asymptotic normality result to construct confidence intervals (CIs) for single components of the unknown regression vector. The novelty of our results is threefold: (1) despite the nonlinearity induced by the activation function, we characterize the asymptotic distribution of a weighted average of the gradients of the network after training; (2) we provide the first frequentist uncertainty quantification guarantees, in the form of valid $(1-\alpha)$-CIs, based on NN estimates; (3) we show that the double-descent phenomenon also occurs in the length of the CIs, with the length increasing and then decreasing as $\frac{d}{n} \nearrow +\infty$ for certain fixed values of $\frac{p}{n}$. We also provide a toolbox to predict the length of the CIs numerically, which lets us compare activation functions and other parameters in terms of CI length.
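To make the setup concrete, here is a minimal sketch of the random features model described in the abstract: Gaussian features, a random untrained first layer W, a Lipschitz activation, and a second layer fitted by Ridge regression; the last lines compute the per-observation gradients of the trained network, a weighted average of which is the quantity the paper proves asymptotically normal. The numerical values, the tanh activation, and the noise level are illustrative assumptions, not the paper's choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, d, lam = 500, 200, 300, 1e-2           # sample size, ambient dim, first-layer width, ridge penalty

X = rng.standard_normal((n, p))              # Gaussian features
beta = rng.standard_normal(p) / np.sqrt(p)   # unknown regression vector
y = X @ beta + 0.1 * rng.standard_normal(n)  # linear response (nonlinear perturbation omitted here)

W = rng.standard_normal((d, p)) / np.sqrt(p) # first layer: random, untrained weights
Z = np.tanh(X @ W.T)                         # Lipschitz activation applied to random features (n x d)

# Second layer trained by Ridge regularization:
# a_hat = argmin_a (1/n) * ||y - Z a||^2 + lam * ||a||^2
a_hat = np.linalg.solve(Z.T @ Z / n + lam * np.eye(d), Z.T @ y / n)

# Gradient of the trained network f(x) = a_hat^T tanh(W x) at each observation x_i,
# via the chain rule: grad f(x_i) = W^T diag(tanh'(W x_i)) a_hat.
act_deriv = 1.0 - Z ** 2                     # tanh'(X W^T), shape n x d
gradients = (act_deriv * a_hat) @ W          # n x p; row i is grad f(x_i)
```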



Review for NeurIPS paper: Asymptotic normality and confidence intervals for derivatives of 2-layers neural network in the random features model

Neural Information Processing Systems

The reviewers point out that this is a borderline submission. They reasonably question several things in the paper:
- it is not clear why the coefficients for which the CLT holds are important;
- the assumptions are restrictive;
- the model studied is too simplistic;
- parts of the analysis are unclear;
- the writing is hasty, with lingering typos.
After my own reading, I agree with these comments. On the other hand, the reviewers also point out that certain aspects of double descent not previously explored are covered here, and these are of more interest than the confidence intervals. My opinion is that the paper would be much stronger if these shortcomings were addressed in a revised manuscript.


Review for NeurIPS paper: Asymptotic normality and confidence intervals for derivatives of 2-layers neural network in the random features model

Neural Information Processing Systems

Additional Feedback: I will increase my score if my concerns are addressed and if the authors could correct my potential misunderstanding.
1. I find the "double descent" phenomenon in the CI length to be interesting. Intuitively, the uncertainty of the model could relate to the variance of the prediction, which we know might blow up at the interpolation threshold due to the variance from label noise or from initialization. Can the authors comment on the plausible mechanism behind this observation?
2. In this setting, what would be the motivation for considering a nonlinear perturbation, which would basically amount to adding noise?
3. The result in Section 2.4 (based on Mei and Montanari 2019) seems to be under the assumption of an i.i.d. weight matrix W. I might have missed something, but is there a place where the authors discuss whether this characterization also holds for an arbitrary W (independent of X) with bounded spectral norm?
4. (minor) Does the characterization also hold in the ridgeless limit ($\lambda \to 0$)?
5. (minor) In Figure 2 (left), why is there a discrepancy between the predicted and the simulated boxplots?
6. (minor) Although this is not the motivation of the work, the mentioned connection between NNs and the RF model typically requires significant overparameterization, and thus the current proportional scaling of n and d might not be the right setup.

